Spark for Python Developers by Amit Nandi
Author: Amit Nandi
Language: eng
Format: epub, pdf
Publisher: Packt Publishing
Published: 2015-12-23T23:00:00+00:00
Supervised and unsupervised learning
Here we look more closely at the traditional machine learning algorithms offered by Spark MLlib. Learning is supervised when the data is labeled and unsupervised when it is not. We further distinguish between categorical and continuous methods depending on whether the data is discrete or continuous.
The following diagram explains the Spark MLlib supervised and unsupervised machine learning algorithms and preprocessing techniques:
The following supervised and unsupervised MLlib algorithms and preprocessing techniques are currently available in Spark:
Clustering: This is an unsupervised machine learning technique where the data is not labeled. The aim is to extract structure from the data:
K-Means: This partitions the data into K distinct clusters
Gaussian Mixture: Clusters are assigned based on the maximum posterior probability of the component
Power Iteration Clustering (PIC): This groups vertices of a graph based on pairwise edge similarities
Latent Dirichlet Allocation (LDA): This is used to group collections of text documents into topics
Streaming K-Means: This dynamically clusters streaming data using a windowing function on the incoming data
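To make the K-Means entry concrete, here is a minimal pure-Python sketch of the idea of partitioning data into K clusters. It is not Spark's implementation: it runs on a small in-memory list rather than an RDD, and the names `sq_dist` and `kmeans`, the sample points, and the seed are all assumptions made for illustration. In MLlib itself, the corresponding entry point is `KMeans.train` in `pyspark.mllib.clustering`.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iterations=20, seed=0):
    """Illustrative K-Means (Lloyd's algorithm) on a small in-memory list.

    Hypothetical helper for this sketch only; not part of Spark MLlib.
    Returns the final cluster centers and one cluster index per point.
    """
    rng = random.Random(seed)
    # Start from k distinct input points chosen at random.
    centers = rng.sample(points, k)
    for _ in range(iterations):
        # Assignment step: attach every point to its nearest center.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: sq_dist(p, centers[i]))
            clusters[nearest].append(p)
        # Update step: move each center to the mean of its members.
        new_centers = []
        for i, members in enumerate(clusters):
            if members:
                mean = tuple(sum(c) / len(members) for c in zip(*members))
                new_centers.append(mean)
            else:
                # Keep the old center if a cluster ends up empty.
                new_centers.append(centers[i])
        if new_centers == centers:
            break  # Centers stopped moving, so the partition is stable.
        centers = new_centers
    labels = [min(range(k), key=lambda i: sq_dist(p, centers[i]))
              for p in points]
    return centers, labels


# Two well-separated groups of unlabeled points (made-up sample data).
sample_points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
                 (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
centers, labels = kmeans(sample_points, k=2)
print(centers)
print(labels)
```

No labels are supplied at any point; the grouping comes only from the distances between points, which is what makes this an unsupervised technique.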
